CLEAVAGE OF CAULOBACTER PRODUCED
RECOMBINANT FUSION PROTEINS

FIELD OF INVENTION

This invention relates to the expression and secretion of recombinant fusion 
proteins from Caulobacter wherein a heterologous polypeptide is fused with all or part 
of the surface layer protein (S-layer protein) of the bacterium. 

BACKGROUND OF THE INVENTION 

Many bacteria assemble layers composed of repetitive, regularly aligned,
proteinaceous sub-units on the outer surface of the cell. These layers are essentially
two-dimensional paracrystalline arrays and, being the outer molecular layer of the
organism, directly interface with the environment. In Caulobacter, the S-layer protein
is synthesized by the cell in large quantities and the S-layer completely envelops the cell
and thus appears to be a protective layer.

Caulobacter are natural inhabitants of most soil and freshwater environments
and may persist in waste water treatment systems and effluents. The bacteria alternate
between a stalked cell that is attached to a surface, and an adhesive motile dispersal cell
that searches to find a new surface upon which to stick and convert to a stalked cell.
The bacteria attach tenaciously to nearly all surfaces and do so without producing the
extracellular enzymes or polysaccharide "slimes" that are characteristic of most other
surface attached bacteria. Caulobacters have simple requirements for growth. The
organisms are ubiquitous in the environment and have been isolated from oligotrophic to
mesotrophic situations. They are known for their ability to tolerate low nutrient level
stresses, for example, low phosphate levels.

All of the freshwater Caulobacter that produce an S-layer are similar and have
S-layers that are substantially the same under electron microscopy. The layers are
hexagonally arranged in all cases, with a similar centre-centre dimension (see: Walker,
S.G., et al. (1992) "Isolation and Comparison of the Paracrystalline Surface Layer
Proteins of Freshwater Caulobacters" J. Bacteriol. 174: 1783-1792).



WO 00/04170    PCT/CA99/00637



16S rRNA sequence analysis of several S-layer producing Caulobacter strains
shows that they group closely (see: Stahl, D.A. et al. (1992) "The Phylogeny of Marine
and Freshwater Caulobacters Reflects Their Habitat" J. Bacteriol. 174: 2193-2198).
DNA probing of Southern blots using the S-layer gene from C. crescentus CB15
identifies a single band that is consistent with the presence of a cognate gene (see:
MacRae, J.D. and J. Smit (1991) "Characterization of Caulobacters Isolated from
Wastewater Treatment Systems" Applied and Environmental Microbiology 57:751-
758). Furthermore, antisera raised against the S-layer protein of CB15 react against
the S-layer protein of other Caulobacter (see: Walker, S.G. et al. (1992) [supra]). All
S-layer proteins isolated from Caulobacter may be substantially purified using the same
methods. All strains appear to have a polysaccharide species which may be required
for S-layer attachment (see: Walker, S.G. et al. (1992) [supra]).

The S-layers elaborated by freshwater isolates of Caulobacter are visibly
indistinguishable from the S-layer produced by Caulobacter strains CB2 and CB15.
The S-layer proteins from the latter strains have approximately 100,000 m.w. although
sizes of S-layer proteins from other species and strains will vary. The hydrophilic
S-layer protein has been characterized both structurally and chemically. It is composed of
ring-like structures spaced at 22 nm intervals arranged in a hexagonal manner on the
outer membrane. The S-layer is bound to the bacterial surface and may be removed by
low pH treatment or by treatment with a calcium chelator such as EDTA.

The similarity of S-layer proteins in different strains of Caulobacter permits the 
use of a cloned S-layer protein gene of one Caulobacter strain for retrieval of the 
corresponding gene in other Caulobacter strains (see: Walker, S.G. et al. (1992) 
[supra]; and MacRae, J.D. et al. (1991) [supra]).

Expression of a heterologous polypeptide as a fusion product with the S-layer
protein of Caulobacter provides advantages not previously seen in systems for
production of recombinant fusion proteins using other organisms such as E. coli and
Salmonella. All known Caulobacter strains are believed to be harmless and are nearly
ubiquitous in aquatic environments. In contrast, many Salmonella and E. coli strains
are pathogens. Consequently, expression and secretion of a heterologous polypeptide
using Caulobacter as a vehicle has the advantage that the expression system will be
stable in a variety of outdoor environments and may not present problems associated
with the use of a pathogenic organism. Furthermore, Caulobacter are natural biofilm
forming species and may be adapted for use in fixed biofilm bioreactors. The quantity of
S-layer protein that is synthesized and secreted by Caulobacter is high, reaching 12%
of the cell protein.

There is an existing need to produce pure proteins and peptides in an economical
manner and in a manner that minimizes or simplifies the purification steps needed after
fermentation. Key commercial areas include the production of recombinant human and
animal therapeutic antibiotic and vaccine peptides, industrial enzymes, protein
polymers, and antibacterial enzymes for foodstuffs. Many of these commercial
applications require low production costs and there are few expression systems available
that can meet such cost constraints. In addition, there are numerous research applications
where rapid methods to produce and purify proteins are needed to facilitate the
discovery stage. This is especially true where there is a desire to express a large
number of proteins with unknown function (from a collection of cloned cDNAs, for
example) or a large number of variants of a single protein (for example, resulting from
site directed mutagenesis) in a search for variants with improved properties.

Generally, proteins must be secreted to be produced at low cost. The primary
reason is the much reduced cost of purification of the target protein from cell material.
However, even for secreted proteins, simple methods of separating the product from
spent culture and cells are important for cost reduction and ease of use.

An international patent application published as WO 97/34000 on September 18,
1997 describes the expression and secretion of recombinant proteins from Caulobacter
in which the recombinant protein is a fusion of all or part of Caulobacter S-layer protein
with a heterologous protein of interest (also see: Bingle, W.H., et al. (1997¹) "Linker
Mutagenesis of the Caulobacter crescentus S-layer protein: Toward a Definition of an N-
terminal Anchoring Region and a C-terminal Secretion Signal and the Potential for
Heterologous Protein Secretion" J. Bacteriol. 179:601-611).

The Caulobacter S-layer secretion apparatus is in the category of "Type 1"
secretion usually found in pathogenic bacteria and noted for its ability to secrete a wide
variety of proteins including large and hydrophilic proteins. The Caulobacter protein
secretion system is particularly useful to secrete recombinant proteins.

The Caulobacter S-layer Type 1 secretion pathway requires only a C-terminal
secretion signal, typically comprising about 200 amino acids at the end of the protein.
The export mechanism is capable of tolerating a wide variety of foreign proteins.
Recombinant proteins may be conveniently produced as fusion proteins with the target
protein being fused to the C-terminal secretion signal. Depending on the application, it
may be desirable to remove the secretion signal following secretion. Not removing the
secretion signal may be an approach suitable for many subunit vaccine applications,
where the remaining S-layer protein serves as a carrier.

A unique and desirable feature of fusion proteins produced by the Caulobacter
S-layer protein secretion system is that they form insoluble aggregates in the culture
medium. This is apparently a consequence of the S-layer sequences associated with
the secretion signal and reflects the fact that the protein normally self-assembles into a
two-dimensional crystalline layer on the bacterium's surface. These aggregates are visible
to the naked eye and are readily collected by simple filtration. With simple water wash
steps, residual bacterial cells are readily flushed away. It is routinely possible to
achieve a protein purity of 90% or better with this simple purification procedure.



DESCRIPTION OF THE PRIOR ART 




Most current protein purification systems for recombinant proteins produced by
bacteria rely upon an affinity matrix to achieve separation of the target protein and to
concentrate the protein for subsequent steps of purification. To accomplish this, genes
for recombinant proteins are commonly constructed so that they contain affinity tags,
which are protein sequences that will bind to an affinity matrix. Commonly used
systems include the following:

(a) glutathione S-transferase (GST) tag, which binds to glutathione-sepharose
matrices;

(b) maltose binding protein (MBP) tag, which binds to amylose matrices;

(c) multiple tandem histidine residues (e.g. "His-6") tag, which binds to
nickel-derivatized solid matrices; and

(d) protein A tag, which binds to Immunoglobulin IgG-derivatized sepharose or
comparable matrices.

Prior art techniques were typically developed so that removal of a target protein
does not disrupt the tag and matrix association. Instead, enzymes that cleave specific
sequences of amino acids are employed. The enzyme cleavage sequence is positioned
between the tag and the desired recombinant protein and enzymatic cleavage is effected
directly on the matrix with attached fusion protein. If a secretion signal is used, the
cleavage site is usually positioned such that the secretion signal is separated from the
target recombinant protein during the cleavage step. The matrix is regenerated for
re-use only after the target recombinant protein has been purified away from the matrix.
Typical enzymes used in these methods are Factor Xa, enterokinase and collagenase.

Chemical cleavage is generally not used because the conditions required for
cleavage will disrupt the binding of affinity tag and matrix or destroy the matrix. When
chemical cleavage is used with recombinant fusion proteins to cleave target protein from
a secretion signal and/or affinity tag, solubilization and denaturation processes are
generally employed. The expectation is that complete or nearly complete unfolding of
the protein is a prerequisite for effective cleavage.

Mild-acid cleavage is predicated on the inclusion, by happenstance or design, of
the acid-sensitive aspartate-proline dipeptide at a desired site for cleavage. The protein
to be cleaved is typically exposed to conditions that solubilize and/or completely
denature the protein prior to cleavage. The chaotropic agent guanidine hydrochloride
(used at 6-7 M) is commonly employed to denature and solubilize the protein prior to,
or at the same time as, acid treatment. Alternately, high concentrations of acids that also
serve as solubilizing agents (as examples: 70-90% formic acid, acetic acid [10%]
pyridine, or relatively high concentrations of HCl (60 mM or more)) are employed.
Because such conditions would disrupt a tag/affinity matrix association, direct cleavage
of an affinity tag from the target protein while a protein remains associated with an
affinity matrix is not attempted.

General conditions for cleavage at aspartate-proline sites are described in
Current Protocols in Molecular Biology (supp. 28; chapter 16.4) John Wiley & Sons
Inc. 1994, and in Landon, M. "Cleavage at Aspartyl-Prolyl Bonds" in Methods in
Enzymology (1977) 47: 145-149. These references suggest that significant variability
of cleavage conditions exists for different proteins and that cleavage might occur in some
instances without first denaturing or solubilizing the protein. However, in practice, the
latter circumstances are rare and proteins to be subjected to acid cleavage at Asp-Pro
dipeptides are usually solubilized to a state where there is no visible turbidity. Such
solubilized protein will normally not pellet when centrifuged at 100,000 x g for 1 hour.
It is now shown that mild-acid conditions may be used for cleavage of aspartate-proline
sites in Caulobacter S-layer fusion proteins without placing the protein in a solubilized
state as described above.

SUMMARY OF INVENTION

This invention is based on the unexpected discovery that recombinant fusion
proteins produced by the Caulobacter S-layer protein secretion system can be cleaved
under mild-acid conditions and solubilization of the fusion protein is not required.
Cleavage may be accomplished while the fusion protein is in the form of an insoluble
aggregate typical of the Caulobacter S-layer protein. Cleavage occurs at aspartate-proline
dipeptides which may be in a heterologous protein portion of the fusion protein
or in the portion that is native to the Caulobacter S-layer protein. The dipeptide may be
placed at a desired location for cleavage by engineering DNA encoding the fusion
protein to express the dipeptide at the desired location. A preferable location for
cleavage may be at or near the junction between a heterologous (target) protein and the
Caulobacter S-layer portion comprising the Caulobacter secretion signal, such that a
cleavage product will be the target protein in its entirety and substantially free of
extraneous amino acids.

The current invention makes it possible to cleave a heterologous (target) protein
from the S-layer protein portion using only mild-acid conditions, even while the fusion
protein is in an aggregated form. These cleavage conditions do not result in significant
solubilization of the S-layer protein portion.

This invention provides a method of cleaving a fusion protein including a first
component which comprises all or part of a Caulobacter S-layer protein including a
Caulobacter C-terminal secretion signal, and a second component heterologous to
Caulobacter. The fusion protein contains at least one aspartate-proline dipeptide. The
method comprises combining the fusion protein with an acid solution of a strength
insufficient to solubilize the fusion protein for a time sufficient for cleavage of the
fusion protein at the aspartate-proline dipeptide. The acid solution may have a pH of
from about 1.5 (e.g. 1.5 ± 0.1) to about 2.5 (e.g. 2.5 ± 0.1), and preferably from about
1.65 (e.g. 1.65 ± 0.05) to about 2.35 (e.g. 2.35 ± 0.05). Preferred pH conditions may
be achieved using an acid equivalent in the range of about 5 to about 20 mM HCl.
The method is typically carried out at a temperature in the range of approximately room
temperature to about 50°C.

This invention also provides a method of preparing a DNA construct suitable for
expression of a fusion protein suitable for use in the method of this invention. The
method comprises joining an upstream DNA segment including DNA heterologous to
Caulobacter which includes a protein of interest to a downstream DNA segment
including DNA for a Caulobacter C-terminal secretion signal which does not encode an
aspartate-proline dipeptide. The upstream segment contains DNA encoding an
aspartate-proline dipeptide at or near the junction between said upstream and
downstream segments.

This invention also provides a method of preparing a fusion protein, comprising 
the steps of expressing a DNA construct as described above in Caulobacter and 
recovering said fusion protein once secreted by the Caulobacter. 

Once cleavage is accomplished according to this invention, the S-layer portion
comprising the Caulobacter secretion signal may remain as an insoluble aggregate. If
the target protein is soluble, the S-layer portion may be easily separated from the target
recombinant protein by simple centrifugation or filtration methods. Thus the system of
this invention facilitates separation as would a tag/affinity matrix system except that
here, the system is also the means for producing an insoluble matrix. In addition, the
insoluble matrix produced by this invention is resistant to the effects of the acid
treatment, allowing direct cleavage of the target recombinant protein. In this way, a
very inexpensive chemical cleavage method can be employed to economically retrieve
recombinant proteins from a bacterial fusion protein. In contrast to the cost of most
affinity matrices, there is little expense associated with the use of the S-layer secretion
signal as it is simply a part of the fermentation/secretion process.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION 



Production of Recombinant Fusion Proteins Using 
the Caulobacter S-layer Secretion System 


Proteins may be produced using the Caulobacter S-layer Type 1 secretion
pathway which requires only the C-terminal secretion signal of the Caulobacter. This
signal is the C-terminal portion of the S-layer protein, which typically comprises about
200 amino acids (see: Bingle, et al. (1997) [supra]; and WO 97/34000). Additional
Caulobacter S-layer DNA upstream from the secretion signal may also be present and
may be desirable to encode portions of the S-layer protein which will contribute to
aggregate formation of the secreted protein. Such additional Caulobacter DNA may
constitute most or all of the remainder of the DNA encoding the S-layer protein.

Standard techniques (such as methods described in WO 97/34000) may be used
to identify the amount of the C-terminal portion of a particular Caulobacter S-layer
protein which functions as the secretion signal.

Creation of fusion proteins is commonly done by preparing DNA which codes 
for the target protein and fusing it in-frame with the C-terminal region of the S-layer 
gene. There are numerous possible methods, with the following being examples. 

1. Oligonucleotide Chemical Synthesis. This involves the design of
complementary single strands, complete with desirable restriction endonuclease cut sites
at the ends, chemical synthesis of the strands followed by annealing, cloning into a
plasmid vector, juxtaposed to an appropriate portion of the C-terminal region of the
S-layer gene.

2. Production of the Target Gene DNA by Polymerase Chain Reaction (PCR)
Amplification of a Target Sequence. In this case, appropriate in-frame restriction
sites are incorporated into the short oligonucleotides used for amplification of a target
sequence, such that the final PCR product can be treated with the appropriate restriction
enzymes (to create the restriction site "sticky ends"), followed by cloning into a plasmid
vector, juxtaposed to an appropriate portion of the C-terminal region of the S-layer
gene.

3. Adapting Restriction Endonuclease Cleavage Sites that are Native to a
Target Protein Gene Sequence for Fusion to the DNA Coding for the C-terminal
S-layer Secretion Signal to Accomplish In-frame Expression of a Chimeric Protein.
This can be accomplished by direct ligation (although it is uncommon that an
appropriate match will occur), or the use of adapter sequences or methods involving
blunting of a restriction site and subsequent blunt-end ligation to change expression
reading frame or join unlike restriction site sticky ends.
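
By way of illustration only, the in-frame requirement common to the three methods above can be checked computationally before a construct is ordered or cloned. The following Python sketch is not part of the disclosed method; the function name check_in_frame and the example sequences are hypothetical. It assumes the target coding segment is supplied as a DNA string beginning at the first base of a codon, and it reports two conditions that would prevent read-through into the downstream S-layer secretion signal: a length that is not a multiple of three, and an in-frame stop codon.

```python
# Illustrative sketch only (hypothetical helper, not from the specification).
# Assumes the insert starts at codon position 1 and will be followed directly
# by the S-layer C-terminal secretion signal coding region.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_in_frame(insert_dna: str) -> list[str]:
    """Return a list of problems that would break an in-frame fusion."""
    seq = insert_dna.upper()
    problems = []

    # A length that is not a multiple of 3 shifts the downstream reading frame.
    remainder = len(seq) % 3
    if remainder != 0:
        problems.append(
            f"length {len(seq)} is not a multiple of 3; downstream frame would shift"
        )

    # Walk complete codons only and flag any stop codon, which would
    # terminate translation before the secretion signal is reached.
    usable = len(seq) - remainder
    for index in range(0, usable, 3):
        codon = seq[index:index + 3]
        if codon in STOP_CODONS:
            problems.append(f"in-frame stop codon {codon} at codon {index // 3 + 1}")

    return problems


if __name__ == "__main__":
    print(check_in_frame("GACCCGGGC"))   # no problems expected
    print(check_in_frame("GACTGACCG"))   # contains an in-frame TGA
```

An empty list indicates that neither problem was found; it does not indicate that the fusion will be expressed or secreted, which must be established experimentally.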

There will be numerous convenient sites for fusion with the C-terminal regions
of the S-layer that lead to the successful expression, secretion and aggregation of a
recombinant fusion protein. Some example positions are at or near the DNA sites
corresponding to amino acids 622, 690, 784, 892 and 907 of the C. crescentus S-layer
gene (see: Appendix 1 and WO 97/34000). Other sites of fusion with the S-layer gene
may also be employed. Most often a plasmid vector is designed such that the C-terminal
gene segment is resident on a plasmid with appropriate restriction sites placed
at the N-terminal junction of the S-layer fragment. Target recombinant protein gene
segments are then cloned into those restriction sites. It is typical to prepare initial
plasmid constructs that are replicated in E. coli. After a construct is produced, it is
typically transferred to a broad host range plasmid which can then be introduced into
the appropriate Caulobacter strain by electroporation. Suitable broad host range
plasmids can be constructed from (but are not limited to) the IncQ, IncW and IncP1
plasmid incompatibility groups.

The introduction of the aspartate-proline (Asp-Pro) dipeptide at the appropriate
site in the fusion protein can be done in several ways. Some examples are:

(a) incorporating a DNA sequence necessary to express the Asp-Pro
dipeptide into the oligonucleotides used to prepare the target sequence, either by
oligonucleotide synthesis or PCR methods;

(b) preparing a DNA segment with appropriate restriction sites at the termini
so that an Asp-Pro dipeptide can be introduced (most often at the junction between
S-layer and target gene) after a fusion recombinant S-layer gene has been made; and

(c) use of a native Asp-Pro dipeptide in either the target DNA or the S-layer
segment (for example, an Asp-Pro dipeptide is located at amino acids 692 and 693 of
the C. crescentus S-layer gene and is suitable for fusions made at the amino acid site).

The methods described above are not the only methods that may be used for
creating and expressing fusion recombinant S-layer proteins, nor is it necessary to have
the engineered genes resident on a plasmid. For example, the expressed gene may be
introduced into the chromosome (using well-known gene insertion or replacement
techniques) and still achieve secretion of the recombinant proteins (see WO 97/34000).
In some cases it may be desirable to produce recombinant fusion proteins as insertions
of heterologous DNA in the middle of the S-layer gene. In such a case, Asp-Pro
dipeptide sequences could be engineered at the N and C-termini of the target peptide.

All possible codon combinations for Asp-Pro will work but the CCA codon for
proline is not preferred due to the likelihood of a low amount of the corresponding
tRNA being present in Caulobacter. The following is an approximate usage table for
C. crescentus.
TABLE 1

Caulobacter crescentus Codon Usage Table

[Amino Acid] [Triplet Code] [Frequency Per Thousand]

Phe   UUU    2.5     Ser   UCU    1.2     Tyr   UAU    6.6     Cys   UGU    0.6
Phe   UUC   27.0     Ser   UCC    8.5     Tyr   UAC    9.6     Cys   UGC    5.5
Leu   UUA    0.0     Ser   UCA    1.2     STOP  UAA    0.8     STOP  UGA    1.6
Leu   UUG    4.4     Ser   UCG   25.7     STOP  UAG    0.6     Trp   UGG    7.2

Leu   CUU    4.4     Pro   CCU    2.5     His   CAU    3.2     Arg   CGU    7.6
Leu   CUC   15.7     Pro   CCC   15.5     His   CAC   12.2     Arg   CGC   44.7
Leu   CUA    1.1     Pro   CCA    0.9     Gln   CAA    3.7     Arg   CGA    3.0
Leu   CUG   72.3     Pro   CCG   27.1     Gln   CAG   30.2     Arg   CGG   12.1

Ile   AUU    2.4     Thr   ACU    1.2     Asn   AAU    4.1     Ser   AGU    0.8
Ile   AUC   49.0     Thr   ACC   37.3     Asn   AAC   23.8     Ser   AGC   14.9
Ile   AUA    0.3     Thr   ACA    0.8     Lys   AAA    2.7     Arg   AGA    0.4
Met   AUG   25.7     Thr   ACG   16.8     Lys   AAG   37.9     Arg   AGG    1.1

Val   GUU    5.4     Ala   GCU    9.5     Asp   GAU   11.1     Gly   GGU    9.5
Val   GUC   42.7     Ala   GCC   84.1     Asp   GAC   48.5     Gly   GGC   64.8
Val   GUA    1.0     Ala   GCA    2.2     Glu   GAA   20.5     Gly   GGA    2.3
Val   GUG   30.7     Ala   GCG   36.7     Glu   GAG   45.4     Gly   GGG    7.7
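
As an illustration of how the usage table may inform the choice of codons for an engineered Asp-Pro site, the following Python sketch ranks the eight possible Asp-Pro codon pairs using the Asp and Pro frequencies listed in Table 1. The ranking rule (ordering pairs by the lower of the two codon frequencies) is an assumption made for this example and is not a rule stated in this specification; the function names are likewise hypothetical.

```python
# Illustrative sketch only. Frequencies (per thousand) are the Asp and Pro
# entries of Table 1; the "rank by the rarer codon" heuristic is an assumption.

ASP_CODONS = {"GAU": 11.1, "GAC": 48.5}
PRO_CODONS = {"CCU": 2.5, "CCC": 15.5, "CCA": 0.9, "CCG": 27.1}


def best_codon(usage: dict[str, float]) -> str:
    """Return the most frequently used codon in a usage dictionary."""
    return max(usage, key=usage.get)


def rank_asp_pro_pairs() -> list[tuple[str, float]]:
    """Rank all Asp-Pro codon pairs by the frequency of their rarer codon."""
    pairs = [
        (asp + pro, min(asp_freq, pro_freq))
        for asp, asp_freq in ASP_CODONS.items()
        for pro, pro_freq in PRO_CODONS.items()
    ]
    return sorted(pairs, key=lambda item: item[1], reverse=True)


def to_dna(rna: str) -> str:
    """Convert an RNA codon string to the corresponding DNA coding strand."""
    return rna.replace("U", "T")


if __name__ == "__main__":
    for pair, score in rank_asp_pro_pairs():
        print(to_dna(pair), score)
```

Under this heuristic the pair GAC CCG ranks first and the two pairs containing CCA rank last, which agrees with the statement above that the CCA codon for proline is not preferred.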
Large quantities (e.g. 12% of total cell protein/3% of input organic carbon) of a
wide range of proteins can be produced, with yields in the order of 250 mg/liter of
batch culture. Fusion proteins with 35 kDa of target peptide are secreted with little
difficulty, although proteins with multiple cysteines may be more difficult to express.
Post-expression glycosylation of proteins does not occur, an advantage for most peptide
expression applications.

Host Expression Strains

For secretion of recombinant fusion S-layer proteins, the Caulobacter strain will
preferably be one which has lost the ability to produce a native S-layer protein, while
retaining a fully functional S-layer protein secretion apparatus. Such strains may be
obtained by screening for mutants that have spontaneously become S-layer protein
negative; or, by directed genetic manipulation, such as (but not limited to) the insertion
of a drug resistance cassette in the middle of the S-layer gene or the substitution of a
version of the S-layer gene which has had a sizeable internal region deleted from the
gene (see: Bingle et al. 1997¹ [supra]; Bingle et al. 1997² "Cell Surface Display of a
Pseudomonas aeruginosa PAK Pilin Peptide with the Paracrystalline Layer of
Caulobacter crescentus" Molec. Microbiol. 26:277-288; and Edwards and Smit (1991)
"A Transducing Bacteriophage for Caulobacter crescentus Uses the Paracrystalline Surface
Layer Protein as a Receptor" J. Bacteriol. 173: 5568-5572). In the case of a genetic
manipulation, a common method for producing such strains is to modify a copy of the
S-layer gene while on a plasmid and then to use well known gene replacement methods
to substitute the modified gene for the native gene in the Caulobacter chromosome (see:
Edwards and Smit (1991) [supra]).

If an entire S-layer gene is to be used for production of a recombinant protein
(via insertion of a target sequence), strains defective in the production of the
lipopolysaccharide (LPS) used for S-layer attachment to the bacterial surface can be
used. These can be prepared by forcing Caulobacter to grow without exogenous
calcium. Under these conditions mutants arise that are uniformly defective in
producing a proficient version of the S-layer LPS (see: Walker, S.G. et al. (1994)
"Characteristics of Mutants of Caulobacter crescentus Defective in Surface Attachment
of the Paracrystalline Layer" J. Bacteriol. 176: 6312-6323).



All Caulobacter S-layer producing strains are suitable for this technology. One
may isolate the S-layer gene from a particular strain (using homology between
Caulobacter S-layers to design probes to detect and clone the S-layer genes) and adapt
the C-terminal region for recombinant protein expression, in a manner similar to that
done for C. crescentus strains (see: MacRae and Smit (1991) [supra], and Walker, S.G.
et al. (1992) [supra]). Alternatively, one may construct recombinant fusion S-layer
genes using the C. crescentus S-layer gene and express the recombinant genes in
alternate Caulobacter hosts.

Freshwater Caulobacter producing S-layers may be readily detected by negative
stain transmission electron microscopy techniques. Caulobacter may be isolated using
the methods outlined by MacRae and Smit (1991) [supra], which take advantage of the
fact that Caulobacter can tolerate periods of starvation while other soil and water
bacteria may not and that they all produce a distinctive stalk structure, visible by light
microscopy (using either phase contrast or standard dye staining methods). Once
Caulobacter strains are isolated in a typical procedure, colonies may be suspended in
2% ammonium molybdate negative stain and applied to plastic-filmed, carbon-stabilized
300 or 400 mesh copper or nickel grids and examined in a transmission electron
microscope at 60 kilovolt accelerating voltage (see: Smit, J. (1986) "Protein Surface
Layers of Bacteria", in Outer Membranes as Model Systems (M. Inouye, ed.) J. Wiley
& Sons, at p. 343-376). S-layers are seen as two-dimensional geometric patterns most
readily on those cells in a colony that have lysed and released their internal contents.

Recombinant Protein Purification



Secreted proteins are separated and shed into the culture media as a macroscopic
precipitate (the "aggregate" referred to herein). The shedding phenomenon is a
consequence of the absence of the N-terminal region of the S-layer protein in the
expressed recombinant protein, or the loss of the lipopolysaccharide species used for
S-layer attachment by the Caulobacter (see: Walker, S.G. et al. (1994) [supra]).
Typically, the aggregate forms as loose, gel-like lumps of pure protein that can readily
be retrieved and separated from the bacteria by simple filtration.

The aggregate may be readily separated from a soluble cleaved target protein by
any suitable technique such as filtration or centrifugation. If the target protein is
insoluble once cleaved, it may then be convenient to solubilize one or both of the
proteins (for example in 8M urea or 6M guanidine HCl) and separate by
chromatography. In this way, only 2 species of protein need to be separated.


Cleavage of Fusion Proteins 

General procedures for performing mild-acid cleavage are known from the
prior art as described above. In the method of this invention, conditions are adjusted to
avoid destruction of the target protein or solubilization of the aggregate containing the
S-layer secretion signal. Excess acid or too high a temperature may increase the
occurrence over time of random cleavages along the length of the fusion protein, which
is to be avoided since such random cleavages may lead to undersized fragmentation of
the fusion protein or solubilization of the aggregated S-layer portion.
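
For illustration, the fragments expected from site-specific cleavage may be predicted from the amino acid sequence of a fusion protein. The following Python sketch is a simplified model and not part of the disclosed method: it assumes one-letter amino acid codes, complete cleavage at every Asp-Pro (D-P) peptide bond, and no random cleavage elsewhere. The function name and the example sequence are hypothetical.

```python
# Illustrative sketch only. Models idealized, complete cleavage of the
# peptide bond between Asp (D) and Pro (P); real cleavage is partial.


def asp_pro_fragments(protein: str) -> list[str]:
    """Split a one-letter protein sequence at every Asp-Pro bond.

    Each upstream fragment ends with Asp and each downstream fragment
    begins with Pro, since the bond between the two residues is broken.
    """
    seq = protein.upper()
    fragments = []
    start = 0
    position = seq.find("DP")
    while position != -1:
        fragments.append(seq[start:position + 1])  # include the Asp residue
        start = position + 1                       # next fragment starts at Pro
        position = seq.find("DP", start)
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    # Hypothetical target peptide followed by a hypothetical signal segment.
    print(asp_pro_fragments("MKTAYDPGSLV"))
```

In this model, a target protein fused upstream of an Asp-Pro site retains the Asp residue at its C-terminus, while the downstream S-layer portion begins with Pro. Actual yields depend on the acid strength, time and temperature used.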


Good yields of target protein with minimum random breaks in the fusion protein
may generally be achieved by using from 5-20 mM HCl (or its equivalent while
employing another acid). The respective pH of these conditions (unbuffered acid
solution) is from about 2.3 to about 1.7. Time and temperature are preferably adjusted
by routine monitoring to achieve the desired cleavage while minimizing random breaks.
For example, temperature may range from room temperature to about 50°C. Time of
treatment may range from about 12 to about 72 hours. Time or temperature outside of
these ranges is permissible depending upon the strength of the acid and the acceptable
yield. Generally, lower yields are obtained with less acid strength, less time or lower
temperatures.
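
The correspondence between the stated HCl concentrations and pH values can be verified with a short calculation. The Python sketch below assumes an ideal, unbuffered solution of a fully dissociated strong acid, so that pH equals the negative base-10 logarithm of the molar concentration; activity effects and any buffering by the protein aggregate are ignored. The function name is hypothetical.

```python
# Illustrative sketch only. Assumes complete dissociation of HCl in an
# unbuffered solution: pH = -log10([H+]), with [H+] equal to the HCl molarity.

import math


def ph_of_strong_acid(millimolar: float) -> float:
    """Return the idealized pH of a strong monoprotic acid at the given mM."""
    if millimolar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(millimolar / 1000.0)


if __name__ == "__main__":
    for concentration in (5, 20, 50):
        print(f"{concentration} mM HCl -> pH {ph_of_strong_acid(concentration):.2f}")
```

Under these assumptions 5 mM HCl corresponds to a pH of about 2.3 and 20 mM HCl to a pH of about 1.7, in agreement with the range given above, while 50 mM HCl corresponds to a pH of about 1.3.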

In the following examples, efficiency of cleavage in the order of 40-80% is
achieved using conditions the same as or similar to the following alternatives:

- 5 mM HCl at 50°C for 48-72 hours

- 20 mM HCl at 30°C for 48-72 hours.

Conditions in excess of the aforementioned values may be employed in some
cases with the possibility of random breaks increasing, particularly with increased acid
strength or temperature. In the following examples, significant random cleavage
occurred with 50 mM HCl at 50°C after 48 hours.

Any acid may be employed in this invention which is normally used in solutions
to which proteins are exposed. Acids which have a deleterious effect on proteins under
dilute conditions should be avoided. For example, HCl or an equivalent amount of
H2SO4 may be used in this invention but oxidizing acids such as nitric acid may not be
suitable.

Example 1. Cleavage of artificial silk protein sequences 
from a secretion signal containing a native aspartate-proline cleavage site. 

An artificial protein sequence resembling spider silk was constructed by 
synthesis of partially overlapping and complementary DNA oligomers, which were 
then completed to a full duplex DNA by Taq polymerase extension, to create a 
sequence that coded for 97 amino acids. The resulting DNA sequence and 
corresponding amino acid sequence are shown in Appendix 2. 

The DNA sequence shown in Appendix 2 was cloned into a gene carrier 
sequence residing in a pUC8 plasmid cloning vector. The gene segment carrier had 
BamHI restriction sites at each end and an internal BglII site. This combination of 
restriction sites allowed the production of multimers of the above sequence, relying on 
the fact that a BamHI sticky end will ligate to a BglII sticky end, with the loss of both 
restriction sites. Thus one copy of the silk-like sequence within the gene segment 
carrier can be put inside a second copy of the same to produce a dimer. Using this 
principle, an 8X repeat was produced and fused to DNA encoding the S-layer secretion 
signal corresponding to the C-terminal portion of the C. crescentus S-layer protein from 
about amino acid 690 onwards (see: Appendix 1). This fusion protein gene was 
introduced into strain CB2A on a broad host range plasmid vector. The 8X multimer 
appeared to be unstable, resulting in recombination events that reduced the 8X multimer 
to a 3X size. The 3-fold repeat of the above 97 amino acid sequence, fused to the 
S-layer secretion signal, was secreted. Protein was collected and subjected to treatment 
with 5 mM HCl for 2 days at 50° C. The result was the liberation of about 80% of 
soluble silk-like polymer, which was readily separated by filtration from the S-layer 
protein, which remained completely aggregated under these conditions. Cleavage 
occurred at the native aspartate-proline dipeptide in the Caulobacter S-layer signal region (see: 
Appendix 1, amino acids numbered 692-693). 
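
The multimerization step relies on BamHI and BglII leaving the same overhang while the ligated junction matches neither recognition site. The following sketch illustrates that point using the standard recognition sequences (BamHI G^GATCC, BglII A^GATCT); the helper names are hypothetical and the code models only the top-strand sequence, not the cloning procedure actually used.

```python
# Standard recognition sites, top strand 5'->3'. Both enzymes cut after the
# first base, leaving a 5'-GATC single-stranded overhang.
SITES = {"BamHI": "GGATCC", "BglII": "AGATCT"}


def overhang(enzyme: str) -> str:
    """Four-base overhang produced by the enzyme (bases 2-5 of its site)."""
    return SITES[enzyme][1:5]


def ligated_junction(upstream_enzyme: str, downstream_enzyme: str) -> str:
    """Six-base sequence formed when an end cut by `upstream_enzyme` is
    joined to an end cut by `downstream_enzyme`."""
    up = SITES[upstream_enzyme]
    down = SITES[downstream_enzyme]
    # The upstream fragment contributes the base before the cut; the
    # downstream fragment contributes the overhang and the final base.
    return up[0] + down[1:]


if __name__ == "__main__":
    print(overhang("BamHI") == overhang("BglII"))  # compatible ends
    for pair in (("BamHI", "BglII"), ("BglII", "BamHI")):
        junction = ligated_junction(*pair)
        recut = junction in SITES.values()
        print(pair, junction, "recleavable" if recut else "site lost")
```

Because the hybrid junction is no longer cut by either enzyme, the outer BamHI sites and internal BglII site of the carrier remain the only usable sites after each round, which is what allows repeated insertion.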

Example 2. Cleavage of the salmonid virus Infectious Pancreatic Necrosis 
Virus (IPNV) surface glycoprotein candidate vaccine sequence from an 
S-layer secretion signal containing a native aspartate-proline site. 

The surface glycoprotein of the IPNV strain is a vaccine candidate. For this 
example and Example 4, the sequence of the first 257 amino acids of the mature protein 
and the corresponding DNA sequence as shown in Appendix 3 were used. 

DNA encoding a segment of the major surface glycoprotein gene of IPNV 
specifying amino acids 145-257 of the protein was fused to DNA sequence specifying 
two putative T-cell activating epitopes: MVF (SEQ ID No:1; LSEIKGVIVHRLEGV, 
derived from Measles Virus protein F) and P2 (SEQ ID No:2; QYIKANSKFIGITEL, 
derived from tetanus toxoid protein). The T-cell epitopes were positioned on the 
C-terminal end of the IPNV sequence. This chimeric protein was in turn fused in frame 
with the C. crescentus S-layer gene at about the amino acid 690 position of the gene and 
introduced into Caulobacter on a broad host range plasmid vector. The resulting 
secreted protein was collected and treated with 5 mM HCl for 2 days at 50° C. 
Cleavage occurred at the native aspartate-proline dipeptide described in Example 1. The 
result was the liberation of about 75% of soluble vaccine candidate chimeric protein 
from the S-layer secretion signal, which remained aggregated. 

Example 3. Cleavage of segments of an E. coli type I pilus tip subunit from 
an S-layer secretion signal containing a native aspartate-proline cleavage site. 

The FimH gene product is the tip pilus subunit of the E. coli strains involved 
with urinary tract infections. Two segments, T3 (specifying the first 145 amino acids 
of the mature peptide) and T7 (specifying the entire 258 amino acids of the mature 
peptide) were fused to the S-layer secretion signal at about amino acid 690 of the 
S-layer sequence. The T3 and T7 sequences are shown in Appendix 4. 

The fusion protein genes were introduced into strain CB2A on a broad host 
range plasmid vector. In both cases the resulting secreted protein was collected and 
treated with 5 mM HCl for 2 days at 50° C. In both cases, the result was the liberation 
of about 50% of soluble vaccine candidate chimeric protein from the S-layer secretion 
signal, which remained aggregated. Cleavage occurred at the native aspartate-proline 
dipeptide described in Example 1. 

Example 4. Cleavage of the salmonid virus IPNV surface glycoprotein 
candidate vaccine sequence from an S-layer secretion signal containing 
an introduced aspartate-proline cleavage site. 

A segment of the major surface glycoprotein gene of IPNV specifying amino 
acids 1-257 of the protein shown in Appendix 3 was fused to a DNA sequence 
specifying a peptide containing an aspartate-proline dipeptide (SEQ ID No: 3; 
SPLGPAGDPEAS) such that the aspartate-proline dipeptide was positioned very near 
the C-terminus of the chimeric protein. This chimeric protein was in turn fused in 
frame with the C. crescentus S-layer gene at about the amino acid 784 position of the gene 
and introduced into strain CB2A on a broad host range plasmid vector. The resulting 
secreted protein was collected and treated with 5 mM HCl for 2 days at 50° C. 
Cleavage occurred at the introduced aspartate-proline dipeptide. The result was the 
liberation of about 40% of insoluble vaccine candidate chimeric protein from the 
S-layer secretion signal, which remained aggregated. 
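
The position of an aspartate-proline site, and the fragments expected from cleavage there, can be located with a simple scan of the one-letter sequence. The sketch below uses the linker peptide of SEQ ID No: 3 as input. It assumes the break falls between the Asp and Pro residues, as is generally described for acid-labile Asp-Pro bonds, and the function name `asp_pro_cleavage` is hypothetical.

```python
from typing import List


def asp_pro_cleavage(seq: str) -> List[str]:
    """Split a one-letter protein sequence between every Asp (D) that is
    immediately followed by Pro (P). Assumes complete, site-specific
    cleavage with no random breaks."""
    fragments = []
    start = 0
    for i in range(len(seq) - 1):
        if seq[i] == "D" and seq[i + 1] == "P":
            fragments.append(seq[start:i + 1])  # fragment ends with Asp
            start = i + 1                       # next fragment starts with Pro
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    linker = "SPLGPAGDPEAS"  # SEQ ID No: 3 from Example 4
    print(asp_pro_cleavage(linker))
```

In a fusion built as in Example 4, the fragment ending in Asp would remain attached to the upstream target protein and the fragment beginning with Pro to the S-layer secretion signal.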

Longer DNA and amino acid sequences referred to above are set out in the 
following Appendices which are part of this description. Appendix 1 sets out the 
complete nucleotide sequence of the C. crescentus S-layer gene (SEQ ID No: 4) with 
the upstream sequence including the -35 and -10 sites of the promoter region and the 
Shine-Dalgarno sequence. The start codon is at nucleotide 101 and the coding sequence 
runs to and includes nucleotide 3179. The amino acid sequence of the C. crescentus 
S-layer protein (SEQ ID No: 5) included in Appendix 1 is predicted from the DNA 
sequence. Appendix 2 sets out the artificial spider silk DNA sequence (SEQ ID No: 6) 
used in Example 1 and the corresponding amino acid sequence (SEQ ID No: 7). 
Appendix 3 sets out the DNA sequence (SEQ ID No: 8) and corresponding amino acid 
sequence (SEQ ID No: 9) of the first 257 amino acids of IPNV as described in 
Examples 2 and 4. Appendix 4 sets out the T3 protein sequence (SEQ ID No: 10) and 
the T7 protein sequence (SEQ ID No: 11) as described in Example 3. 

All publications, patents and patent applications referred to herein are hereby 
incorporated by reference. While this invention has been described according to 
particular embodiments and by reference to certain examples, it will be apparent to 
those of skill in the art that variations and modifications of the invention as described 
herein fall within the spirit and scope of the attached claims. 
Appendix 1 

GCTATTGTCG ACGTATGACG TTTGCTCTAT AGCCATCGCT GCTCCCATGC [illegible] 60 
GTCGCAGGGG GTGTGGGATT TTTTTTGGGA GACAATCCTC ATGGCCTATA CGACGGCCCA 120 
GTTGGTGACT GCGTACACCA ACGCCAACCT CGGCAAGGCG CCTGACGCCG CCACCACGCT 180 
GACGCTCGAC GCGTACGCGA CTCAAACCCA GACGGGCGGC CTCTCGGACG CCGCTGCGCT 240 
GACCAACACC CTGAAGCTGG TCAACAGCAC GACGGCTGTT GCCATCCAGA CCTACCAGTT 300 
CTTCACCGGC GTTGCCCCGT CGGCCGCTGG [illegible] [illegible] [illegible] 360 
CACCAACGAC CTGAACGACG CGTACTACTC [illegible] CAGGAAAACC [illegible] 420 
CTTCTCGATC AACCTGGCCA CGGGCGCCGG [illegible] ACGGCTTTCG CCGCCGCCTA 480 
CACGGGCGTT TCGTACGCCC AGACGGTCGC [illegible] GACAAGATCA TCGGCAACGC 540 
CGTCGCGACC GCCGCTGGCG TCGACGTCGC [illegible] GCTTTCCTGA GCCGCCAGGC 600 
CAACATCGAC TACCTGACCG CCTTCGTGCG [illegible] CCGTTCACGG CCGCTGCCGA 660 
CATCGATCTG GCCGTCAAGG CCGCCCTGAT CGGCACCATC CTGAACGCCG CCACGGTGTC 720 
GGGCATCGGT GGTTACGCGA CCGCCACGGC CGCGATGATC AACGACCTGT CGGACGGCGC 780 
CCTGTCGACC GACAACGCGG CTGGCGTGAA CCTGTTCACC GCCTATCCGT CGTCGGGCGT 840 
GTCGGGTTCG ACCCTCTCGC TGACCACCGG CACCGACACC CTGACGGGCA CCGCCAACAA 900 
CGACACGTTC GTTGCGGGTG AAGTCGCCGG CGCTGCGACC CTGACCGTTG GCGACACCCT 960 
GAGCGGCGGT GCTGGCACCG ACGTCCTGAA CTGGGTGCAA GCTGCTGCGG TTACGGCTCT 1020 
GCCGACCGGC GTGACGATCT CGGGCATCGA AACGATGAAC GTGACGTCGG GCGCTGCGAT 1080 
CACCCTGAAC ACGTCTTCGG GCGTGACGGG TCTGACCGCC CTGAACACCA ACACCAGCGG 1140 
CGCGGCTCAA ACCGTCACCG CCGGCGCTGG CCAGAACCTG ACCGCCACGA CCGCCGCTCA 1200 
AGCCGCGAAC AACGTCGCCG TCGACGGGCG CGCCAACGTC ACCGTCGCCT CGACGGGCGT 1260 
GACCTCGGGC ACGACCACGG TCGGCGCCAA CTCGGCCGCT TCGGGCACCG TGTCGGTGAG 1320 
CGTCGCGAAC TCGAGCACGA CCACCACGGG CGCTATCGCC GTGACCGGTG GTACGGCCGT 1380 
GACCGTGGCT CAAACGGCCG GCAACGCCGT GAACACCACG TTGACGCAAG CCGACGTGAC 1440 
CGTGACCGGT AACTCCAGCA CCACGGCCGT GACGGTCACC CAAACCGCCG CCGCCACCGC 1500 
CGGCGCTACG GTCGCCGGTC GCGTCAACGG CGCTGTGACG ATCACCGACT CTGCCGCCGC 1560 
CTCGGCCACG ACCGCCGGCA AGATCGCCAC GGTCACCCTG GGCAGCTTCG GCGCCGCCAC 1620 
GATCGACTCG AGCGCTCTGA CGACCGTCAA CCTGTCGGGC ACGGGCACCT CGCTCGGCAT 1680 
CGGCCGCGGC GCTCTGACCG CCACGCCGAC CGCCAACACC CTGACCCTGA ACGTCAATGG 1740 
TCTGACGACG ACCGGCGCGA TCACGGACTC GGAAGCGGCT GCTGACGATG GTTTCACCAC 1800 
CATCAACATC GCTGGTTCGA CCGCCTCTTC GACGATCGCC AGCCTGGTGG CCGCCGACGC 1860 
GACGACCCTG AACATCTCGG GCGACGCTCG CGTCACGATC ACCTCGCACA CCGCTGCCGC 1920 
CCTGACGGGC ATCACGGTGA CCAACAGCGT TGGTGCGACC CTCGGCGCCG AACTGGCGAC 1980 
CGGTCTGGTC TTCACGGGCG GCGCTGGCCG TGACTCGATC CTGCTGGGCG CCACGACCAA 2040 
GGCGATCGTC ATGGGCGCCG GCGACGACAC CGTCACCGTC AGCTCGGCGA CCCTGGGCGC 2100 
TGGTGGTTCG GTCAACGGCG GCGACGGCAC CGACGTTCTG GTGGCCAACG TCAACGGTTC 2160 
GTCGTTCAGC GCTGACCCGG CCTTCGGCGG CTTCGAAACC CTCCGCGTCG CTGGCGCGGC 2220 
GGCTCAAGGC TCGCACAACG CCAACGGCTT CACGGCTCTG CAACTGGGCG CGACGGCGGG 2280 
TGCGACGACC TTCACCAACG TTGCGGTGAA TGTCGGCCTG ACCGTTCTGG CGGCTCCGAC 2340 
CGGTACGACG ACCGTGACCC TGGCCAACGC CACGGGCACC TCGGACGTGT TCAACCTGAC 2400 
CCTGTCGTCC TCGGCCGCTC TGGCCGCTGG TACGGTTGCG CTGGCTGGCG TCGAGACGGT 2460 
GAACATCGCC GCCACCGACA CCAACACGAC CGCTCACGTC GACACGCTGA CGCTGCAAGC 2520 
CACCTCGGCC AAGTCGATCG TGGTGACGGG CAACGCCGGT CTGAACCTGA CCAACACCGG 2580 
CAACACGGCT GTCACCAGCT TCGACGCCAG CGCCGTCACC GGCACGGCTC CGGCTGTGAC 2640 
CTTCGTGTCG GCCAACACCA CGGTGGGTGA AGTCGTCACG ATCCGCGGCG GCGCTGGCGC 2700 
CGACTCGCTG ACCGGTTCGG CCACCGCCAA TGACACCATC ATCGGTGGCG CTGGCGCTGA 2760 
CACCCTGGTC TACACCGGCG GTACGGACAC CTTCACGGGT GGCACGGGCG CGGATATCTT 2820 
CGATATCAAC GCTATCGGCA CCTCGACCGC TTTCGTGACG ATCACCGACG CCGCTGTCGG 2880 
CGACAAGCTC GACCTCGTCG GCATCTCGAC GAACGGCGCT ATCGCTGACG GCGCCTTCGG 2940 
CGCTGCGGTC ACCCTGGGCG CTGCTGCGAC CCTGGCTCAG TACCTGGACG CTGCTGCTGC 3000 
CGGCGACGGC AGCGGCACCT CGGTTGCCAA GTGGTTCCAG TTCGGCGGCG ACACCTATGT 3060 
CGTCGTTGAC AGCTCGGCTG GCGCGACCTT CGTCAGCGGC GCTGACGCGG TGATCAAGCT 3120 
GACCGGTCTG GTCACGCTGA CCACCTCGGC CTTCGCCACC GAAGTCCTGA CGCTCGCCTA 3180 
AGCGAACGTC TGATCCTCGC CTAGGCGAGG ATCGCTAGAC TAAGAGACCC CGTCTTCCGA 3240 
AAGGGAGGCG GGGTCTTTCT TATGGGCGCT ACGCGCTGGC CGGCCTTGCC TAGTTCCGGT 3300 
Appendix 1 (cont'd) 

Met Ala Tyr Thr Thr Ala Gln Leu Val Thr Ala Tyr Thr Asn Ala Asn 
1 5 10 15 

Leu Gly Lys Ala Pro Asn Ala Ala Thr Thr Leu Thr Leu Asp Ala Tyr 
20 25 30 

Ala Thr Gln Thr Gln Thr Gly Gly Leu Ser Asp Ala Ala Ala Leu Thr 
35 40 45 

Asn Thr Leu Lys Leu Val Asn Ser Thr Thr Ala Val Ala Ile Gln Thr 
50 55 60 

Tyr Gln Phe Phe Thr Gly Val Ala Pro Ser Ala Ala Gly Leu Asp Phe 
65 70 75 80 

Leu Val Asp Ser Thr Thr Asn Thr Asn Asp Leu Asn Asp Ala Tyr Tyr 
85 90 95 

Ser Lys Phe Ala Gln Glu Asn Arg Phe Ile Asn Phe Ser Ile Asn Leu 
100 105 110 

Ala Thr Gly Ala Gly Ala Gly Ala Thr Ala Phe Ala Ala Ala Tyr Thr 
115 120 125 

Gly Val Ser Tyr Ala Gln Thr Val Ala Thr Ala Tyr Asp Lys Ile Ile 
130 135 140 

Gly Asn Ala Val Ala Thr Ala Ala Gly Val Asp Val Ala Ala Ala Val 
145 150 155 160 

Ala Phe Leu Ser Arg Gln Ala Asn Ile Asp Tyr Leu Thr Ala Phe Val 
165 170 175 

Arg Ala Asn Thr Pro Phe Thr Ala Ala Ala Asp Ile Asp Leu Ala Val 
180 185 190 

Lys Ala Ala Leu Ile Gly Thr Ile Leu Asn Ala Ala Thr Val Ser Gly 
195 200 205 

Ile Gly Gly Tyr Ala Thr Ala Thr Ala Ala Met Ile Asn Asp Leu Ser 
210 215 220 

Asp Gly Ala Leu Ser Thr Asp Asn Ala Ala Gly Val Asn Leu Phe Thr 
225 230 235 240 

Ala Tyr Pro Ser Ser Gly Val Ser Gly Ser Thr Leu Ser Leu Thr Thr 
245 250 255 

Gly Thr Asp Thr Leu Thr Gly Thr Ala Asn Asn Asp Thr Phe Val Ala 
260 265 270 

Gly Glu Val Ala Gly Ala Ala Thr Leu Thr Val Gly Asp Thr Leu Ser 
275 280 285 

Gly Gly Ala Gly Thr Asp Val Leu Asn Trp Val Gln Ala Ala Ala Val 
290 295 300 

Thr Ala Leu Pro Thr Gly Val Thr Ile Ser Gly Ile Glu Thr Met Asn 
305 310 315 320 

Val Thr Ser Gly Ala Ala Ile Thr Leu Asn Thr Ser Ser Gly Val Thr 
325 330 335 

Gly Leu Thr Ala Leu Asn Thr Asn Thr Ser Gly Ala Ala Gln Thr Val 
340 345 350 

Thr Ala Gly Ala Gly Gln Asn Leu Thr Ala Thr Thr Ala Ala Gln Ala 
355 360 365 

Ala Asn Asn Val Ala Val Asp Gly Arg Ala Asn Val Thr Val Ala Ser 
370 375 380 

Thr Gly Val Thr Ser Gly Thr Thr Thr Val Gly Ala Asn Ser Ala Ala 
385 390 395 400 

Ser Gly Thr Val Ser Val Ser Val Ala Asn Ser Ser Thr Thr Thr Thr 
405 410 415 

Gly Ala Ile Ala Val Thr Gly Gly Thr Ala Val Thr Val Ala Gln Thr 
420 425 430 

Ala Gly Asn Ala Val Asn Thr Thr Leu Thr Gln Ala Asp Val Thr Val 
435 440 445 

Thr Gly Asn Ser Ser Thr Thr Ala Val Thr Val Thr Gln Thr Ala Ala 
450 455 460 

Ala Thr Ala Gly Ala Thr Val Ala Gly Arg Val Asn Gly Ala Val Thr 
465 470 475 480 

Ile Thr Asp Ser Ala Ala Ala Ser Ala Thr Thr Ala Gly Lys Ile Ala 
485 490 495 

Thr Val Thr Leu Gly Ser Phe Gly Ala Ala Thr Ile Asp Ser Ser Ala 
500 505 510 

Leu Thr Thr Val Asn Leu Ser Gly Thr Gly Thr Ser Leu Gly Ile Gly 
515 520 525 

Arg Gly Ala Leu Thr Ala Thr Pro Thr Ala Asn Thr Leu Thr Leu Asn 
530 535 540 

Val Asn Gly Leu Thr Thr Thr Gly Ala Ile Thr Asp Ser Glu Ala Ala 
545 550 555 560 

Ala Asp Asp Gly Phe Thr Thr Ile Asn Ile Ala Gly Ser Thr Ala Ser 
565 570 575 

Ser Thr Ile Ala Ser Leu Val Ala Ala Asp Ala Thr Thr Leu Asn Ile 
580 585 590 

Ser Gly Asp Ala Arg Val Thr Ile Thr Ser His Thr Ala Ala Ala Leu 
595 600 605 

Thr Gly Ile Thr Val Thr Asn Ser Val Gly Ala Thr Leu Gly Ala Glu 
610 615 620 

Leu Ala Thr Gly Leu Val Phe Thr Gly Gly Ala Gly Arg Asp Ser Ile 
625 630 635 640 

Leu Leu Gly Ala Thr Thr Lys Ala Ile Val Met Gly Ala Gly Asp Asp 
645 650 655 

Thr Val Thr Val Ser Ser Ala Thr Leu Gly Ala Gly Gly Ser Val Asn 
660 665 670 

Gly Gly Asp Gly Thr Asp Val Leu Val Ala Asn Val Asn Gly Ser Ser 
675 680 685 

Phe Ser Ala Asp Pro Ala Phe Gly Gly Phe Glu Thr Leu Arg Val Ala 
690 695 700 

Gly Ala Ala Ala Gln Gly Ser His Asn Ala Asn Gly Phe Thr Ala Leu 
705 710 715 720 

Gln Leu Gly Ala Thr Ala Gly Ala Thr Thr Phe Thr Asn Val Ala Val 
725 730 735 

Asn Val Gly Leu Thr Val Leu Ala Ala Pro Thr Gly Thr Thr Thr Val 
740 745 750 

Thr Leu Ala Asn Ala Thr Gly Thr Ser Asp Val Phe Asn Leu Thr Leu 
755 760 765 

Ser Ser Ser Ala Ala Leu Ala Ala Gly Thr Val Ala Leu Ala Gly Val 
770 775 780 

Glu Thr Val Asn Ile Ala Ala Thr Asp Thr Asn Thr Thr Ala His Val 
785 790 795 800 

Asp Thr Leu Thr Leu Gln Ala Thr Ser Ala Lys Ser Ile Val Val Thr 
805 810 815 

Gly Asn Ala Gly Leu Asn Leu Thr Asn Thr Gly Asn Thr Ala Val Thr 
820 825 830 

Ser Phe Asp Ala Ser Ala Val Thr Gly Thr Ala Pro Ala Val Thr Phe 
835 840 845 

Val Ser Ala Asn Thr Thr Val Gly Glu Val Val Thr Ile Arg Gly Gly 
850 855 860 

Ala Gly Ala Asp Ser Leu Thr Gly Ser Ala Thr Ala Asn Asp Thr Ile 
865 870 875 880 

Ile Gly Gly Ala Gly Ala Asp Thr Leu Val Tyr Thr Gly Gly Thr Asp 
885 890 895 

Thr Phe Thr Gly Gly Thr Gly Ala Asp Ile Phe Asp Ile Asn Ala Ile 
900 905 910 

Gly Thr Ser Thr Ala Phe Val Thr Ile Thr Asp Ala Ala Val Gly Asp 
915 920 925 

Lys Leu Asp Leu Val Gly Ile Ser Thr Asn Gly Ala Ile Ala Asp Gly 
930 935 940 

Ala Phe Gly Ala Ala Val Thr Leu Gly Ala Ala Ala Thr Leu Ala Gln 
945 950 955 960 

Tyr Leu Asp Ala Ala Ala Ala Gly Asp Gly Ser Gly Thr Ser Val Ala 
965 970 975 

Lys Trp Phe Gln Phe Gly Gly Asp Thr Tyr Val Val Val Asp Ser Ser 
980 985 990 

Ala Gly Ala Thr Phe Val Ser Gly Ala Asp Ala Val Ile Lys Leu Thr 
995 1000 1005 

Gly Leu Val Thr Leu Thr Thr Ser Ala Phe Ala Thr Glu Val Leu Thr 
1010 1015 1020 

Leu Ala 
1025 
Appendix 2 

GAA TTC AGA TCT CAG GGC GCG GGG CAG GGT GGC TAT GGT GGG CTC GGC TCG CAA GGC GCT 
EFRSQGAGQGGYGGLGSQGA 

GGC CTG GGT GGC CAG GGC GCT GGC GCG GCC GCG GCC GCT GCG GCC GGT 
GRGGQGAGAAAAAAAGG 

GCT GGC CAG GGC GGG CTG GGC TCG CAG GGC GCC GGC CAA GGC GCT GGC GCC GCG GCC GCT 
AGQGGLGSQGAGQGAGAAAA 

GCG GCC GGT GGC GCC GGC CAG GGT GGC TAC GGC GGC CTG GGC AGC CAG GGC GCC GGT CGC 
AAGGAGQGGYGGLGSQGAGR 

GGC GGT CAG GGC GCC GGT GCC GCG GCC GCT GCG GCC GGT GGC GCT GGG CAA GGC GGC TAC 
GGQGAGAAAAAAGGAGQGGY 

GGC GGT CTG GGA TCC 
GGLGS 
Appendix 3 

1/1 
atg aac aca aac aag gca acc gca act tac ttg aaa tcc att atg ctt cca gag act [illegible] 
met asn thr asn lys ala thr ala thr tyr leu lys ser ile met leu pro glu thr gly 

61/21 
cca gca agc atc ccg gac gac ata acg gag aga cac atc tta aag cag gag acc tcg tca 
pro ala ser ile pro asp asp ile thr glu arg his ile leu lys gln glu thr ser ser 

121/41 
tac aac tta gag gtc tcc gaa tca gga agt ggc att ctt gtt tgt ttc cct ggg gca [illegible] 
tyr asn leu glu val ser glu ser gly ser gly ile leu val cys phe pro gly ala pro 

181/61 
ggc tca cgg atc ggt gca cac tac aga tgg aat gcg aac cag acg ggg ctg gag ttc [illegible] 
gly ser arg ile gly ala his tyr arg trp asn ala asn gln thr gly leu glu phe asp 

241/81 
cag tgg ctg gag acg tcg cag gac ctg aag aaa gcc ttc aac tac ggg agg ctg atc [illegible] 
gln trp leu glu thr ser gln asp leu lys lys ala phe asn tyr gly arg leu ile ser 

301/101 
agg aaa tac gac att caa agc tcc aca cta ccg gcc ggt ctc tat gct ctg aac ggg [illegible] 
arg lys tyr asp ile gln ser ser thr leu pro ala gly leu tyr ala leu asn gly thr 

361/121 
ctc aac gct gcc acc ttc gaa ggc agt ctg tct gag gtg gag agc ctg acc tac aat agc 
leu asn ala ala thr phe glu gly ser leu ser glu val glu ser leu thr tyr asn ser 

421/141 
ctg atg tcc cta act acg aac ccc cag gac aaa gcc aac aac cag ctg gtg acc aaa gga 
leu met ser leu thr thr asn pro gln asp lys ala asn asn gln leu val thr lys gly 

481/161 
gtc acc gtc ctg aat cta cca aca ggg ttc gac aaa cca tac gtc cgc cta gag gac gag 
val thr val leu asn leu pro thr gly phe asp lys pro tyr val arg leu glu asp glu 

541/181 
aca ccc cag ggt ctc cag tca atg aac ggg gcc agg atg agg tgc aca gct gca att gca 
thr pro gln gly leu gln ser met asn gly ala arg met arg cys thr ala ala ile ala 

601/201 
cca cgg agg tac gag atc gac ctc cca tcc caa agc cta ccc ccc gtt cct gcg aca gga 
pro arg arg tyr glu ile asp leu pro ser gln ser leu pro pro val pro ala thr gly 

661/221 
acc ctc acc act ctc tac gag gga aac gcc gac atc gtc agc tcc aca aca gtg acg gga 
thr leu thr thr leu tyr glu gly asn ala asp ile val ser ser thr thr val thr gly 

721/241 
gac ata aac ttc agt ctg gca gaa cga ccc gca aac gag acc agg ttc gac ttc cag ctg 
asp ile asn phe ser leu ala glu arg pro ala asn glu thr arg phe asp phe gln leu 
WHAT IS CLAIMED IS: 

1. A method of cleaving a fusion protein including a first component which comprises all 
or part of a Caulobacter S-layer protein including a Caulobacter C-terminal secretion 
signal, and a second component heterologous to Caulobacter, the fusion protein 
containing at least one aspartate-proline dipeptide, wherein the method comprises 
combining the fusion protein with an acid solution of a strength insufficient to 
solubilize the fusion protein for a time sufficient for cleavage of the fusion protein at 
said aspartate-proline dipeptide. 

2. The method of claim 1 wherein an aspartate-proline dipeptide is situated between the 
first and second components or adjacent a junction between the first and second 
components. 

3. The method of claim 1 or 2, wherein the acid solution has a pH of from about 1.5 to 
about 2.5. 

4. The method of claim 1 or 2, wherein the acid solution has a pH of about 1.65 to about 
2.35. 

5. The method of any one of claims 1-4 wherein the method is carried out at a 
temperature in the range of about 30° C. to about 50° C. 

6. The method of any one of claims 1-5, wherein the method further comprises 
separating products cleaved from the fusion protein. 

7. A method of preparing a DNA construct for expression of a fusion protein suitable for 
use in the method of claim 1, wherein the method comprises joining an upstream 
DNA segment including DNA heterologous to Caulobacter which encodes a protein 
of interest, to a downstream DNA segment including DNA for a Caulobacter 
C-terminal secretion signal, wherein the downstream DNA segment does not encode an 
aspartate-proline dipeptide, and wherein the upstream segment contains DNA 
encoding an aspartate-proline dipeptide at or near an end of said upstream segment to 
be joined to said downstream segment. 

8. A method of preparing a fusion protein, comprising: 

(1) expressing a DNA construct prepared as described in claim 7 in 
Caulobacter, and 

(2) recovering said fusion protein secreted by the Caulobacter. 